Using XML in SQL Server 2008: Relational Data As XML - The FOR XML Modes (part 4) - EXPLICIT Mode

1/10/2011 11:47:44 AM

EXPLICIT Mode

FOR XML EXPLICIT is a powerful, oft-maligned, somewhat daunting mode of SQL Server XML production. It allows for the shaping of row data in any desirable XML structure, but the SQL required to produce it can easily end up being hundreds (or, in some cases, thousands) of lines long, leading to a potential maintenance headache.

With EXPLICIT mode, the query author is responsible for making sure the XML is well formed and that the rowset generated behind the scenes corresponds to a very particular format.

FOR XML PATH mode renders FOR XML EXPLICIT obsolete except when you need to output column values as CDATA. This section therefore briefly covers the required query structure and provides an example of this particular case.
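For comparison, here is a rough sketch of how a similar shape might be produced with FOR XML PATH. It is not taken from the original listings: the table variables @Reason and @WorkOrder are hypothetical stand-ins for the AdventureWorks tables, loaded with a couple of the values that appear later in this section. Note that the reason name comes out as ordinary entitized text here, not as a CDATA section, which is the one thing PATH mode cannot do.

```sql
-- Hypothetical stand-ins for Production.ScrapReason and Production.WorkOrder.
DECLARE @Reason TABLE (ScrapReasonId int, Name nvarchar(50));
DECLARE @WorkOrder TABLE (WorkOrderId int, ScrapReasonId int, ScrappedQty int);

INSERT INTO @Reason VALUES (12, N'Thermoform temperature too high');
INSERT INTO @WorkOrder VALUES (2573, 12, 14), (4972, 12, 1);

SELECT
    r.ScrapReasonId AS [ScrapReasonId],      -- subelement of ScrapReason
    r.Name          AS [text()],             -- plain text node (entitized, not CDATA)
    (   -- nested FOR XML query builds the WorkOrder children
        SELECT w.WorkOrderId AS [@WorkOrderId],
               w.ScrappedQty AS [@ScrappedQuantity]
        FROM @WorkOrder w
        WHERE w.ScrapReasonId = r.ScrapReasonId
        ORDER BY w.WorkOrderId
        FOR XML PATH('WorkOrder'), TYPE
    )
FROM @Reason r
FOR XML PATH('ScrapReason'), ROOT('ScrappedWorkOrders');
```

The shaping is expressed through column aliases and a nested subquery, with no Tag or Parent columns and no UNION ALL, which is why PATH mode is usually the easier choice when CDATA is not required.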

Note

It’s not an easy task to understand EXPLICIT mode just by reading. Practice is essential. After you’ve succeeded in using it a few times, it will begin to feel like an intuitive, albeit complex, way of doing things.

Microsoft calls the relational structure behind EXPLICIT mode queries the universal table. The universal table has a hierarchical structure sometimes known as the adjacency list model. Put simply, this means that the first column in the table is the primary key, and the second column is a foreign key referencing it, creating a parent–child relationship between rows in the same table. XML similarly models this relationship through the nesting of elements because nodes contained inside other nodes also hold a parent–child relationship.

Each level of hierarchical depth in the universal table is created by a separate SELECT statement, and each SELECT is unioned to the next, producing the complete rowset. Some details on the table structure help make this clearer:

The first column in the universal table (think of it as the primary key) must be named Tag and hold an integer value. The value of Tag can be thought of as representing the depth of the node that will be produced.
The second column must be named Parent and must refer to a valid value of Tag, or null, in the case of the first branch.
The rest of the selected columns in the query are mapped either to attributes, subelements, or CDATA nodes, or they may be selected but not produced in the resultant XML.
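To make the layout concrete before looking at a real query, the following minimal sketch builds a universal table from literal values only. The element names Reason and Order and the values are illustrative assumptions, not part of the original example.

```sql
-- Tag 1 rows describe the parent level; their Parent is NULL.
SELECT 1    AS Tag,
       NULL AS Parent,
       12   AS [Reason!1!Id],
       NULL AS [Order!2!Id]     -- filled in by the Tag 2 rows below
UNION ALL
-- Tag 2 rows describe the child level; Parent = 1 points at the Tag 1 rows.
SELECT 2, 1, 12, 2573
UNION ALL
SELECT 2, 1, 12, 4972
-- The parent row sorts first because NULL sorts before any value.
ORDER BY [Reason!1!Id], [Order!2!Id];

-- Rowset returned:
-- Tag  Parent  Reason!1!Id  Order!2!Id
-- 1    NULL    12           NULL
-- 2    1       12           2573
-- 2    1       12           4972
```

Each row carries every column of the final layout. Columns that belong to a different Tag level are either NULL or repeat the key that is used for sorting.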

Listing 8 shows a query that returns a universal table. Later, you can change it so that it returns XML by adding FOR XML EXPLICIT.

Listing 8. A Query That Generates the Universal Table Rowset Format

SELECT
    1 as Tag,
    NULL as Parent,
    Reason.ScrapReasonId 'ScrapReason!1!ScrapReasonId!element',
    Name 'ScrapReason!1!!cdata',
    WorkOrderId 'WorkOrder!2!WorkOrderId',
    NULL 'WorkOrder!2!ScrappedQuantity'
FROM Production.ScrapReason Reason
JOIN Production.WorkOrder WorkOrder
ON Reason.ScrapReasonId = WorkOrder.ScrapReasonID
WHERE Reason.ScrapReasonId = 12

UNION ALL

SELECT
    2 as Tag,
    1 as Parent,
    Reason.ScrapReasonId,
    NULL,
    WorkOrderId,
    ScrappedQty
FROM Production.ScrapReason Reason
JOIN Production.WorkOrder WorkOrder
ON Reason.ScrapReasonId = WorkOrder.ScrapReasonID
WHERE Reason.ScrapReasonId = 12

The first SELECT statement in the union must use a special column alias syntax that tells the XML generator how to shape each column. This is the syntax:


element_name!corresponding_Tag_value!attribute_or_subelement_name[!directive]

The following list explains each part of the preceding syntax:

element_name— The name of the generated element associated with each row.
corresponding_Tag_value— The value of Tag for the context rowset.
attribute_or_subelement_name— The name of the attribute or subelement associated with the column in the context row.
directive— An optional directive to the XML generator. The possible values are
- element—When specified, tells the XML generator to produce the column associated with attribute_or_subelement_name as a subelement. (An attribute is produced by default.)
- hide—Tells the XML generator not to show the associated column data at all in the produced XML. This may be needed if there is some side effect desired from selecting the column but the data does not need to be shown.
- cdata—Tells the XML generator to output the associated column data as a CDATA section.
- xml—Disables entitization of text data. This can lead to non-well-formed XML because the XML special characters (&, ', ", <, >) are output directly.
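The directives can be tried in isolation with a single-level query over literal values. This is only a sketch: the element name Reason, the column names, and the values are made up for illustration.

```sql
SELECT 1 AS Tag,
       NULL AS Parent,
       12              AS [Reason!1!Id],            -- no directive: attribute Id="12"
       N'Too hot'      AS [Reason!1!Name!element],  -- element: <Name> subelement
       N'a < b & c'    AS [Reason!1!!cdata],        -- cdata: CDATA section (name part left empty)
       N'<b>raw</b>'   AS [Reason!1!Markup!xml],    -- xml: like element, but not entitized
       99              AS [Reason!1!SortKey!hide]   -- hide: selected but not emitted
FOR XML EXPLICIT;
```

Note that the cdata column leaves the attribute_or_subelement_name part of the alias empty, just as ScrapReason!1!!cdata does in Listing 8, and that the column must hold character data.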

In all subsequent SELECT statements, the columns corresponding to the rowsets identified by Tag are selected according to the layout specified in the first SELECT.

Notice how in Listing 8, NULL is selected for WorkOrder!2!ScrappedQuantity. This is done because the value for that column will be filled in by the SELECT statement having a Tag value of 2, as specified in corresponding_Tag_value. Likewise, ScrappedQty is selected only in the second SELECT statement (where NULL is supplied for ScrapReason!1!!cdata) because Name is selected in this column in the first SELECT. The primary key (ScrapReasonId) that is the common thread joining both sets of rows must be specified in both SELECT statements for this query to work.

Now that you have an understanding of the universal table structure that must be built, the only thing left to do is add FOR XML EXPLICIT to the query in Listing 8 and then order the output according to the desired element hierarchy. Listing 9 illustrates the final query and its result.

Listing 9. Using FOR XML EXPLICIT

SELECT
    1 as Tag,
    NULL as Parent,
    Reason.ScrapReasonId 'ScrapReason!1!ScrapReasonId!element',
    Name 'ScrapReason!1!!cdata',
    WorkOrderId 'WorkOrder!2!WorkOrderId',
    NULL 'WorkOrder!2!ScrappedQuantity'
FROM Production.ScrapReason Reason
JOIN Production.WorkOrder WorkOrder
ON Reason.ScrapReasonId = WorkOrder.ScrapReasonID
WHERE Reason.ScrapReasonId = 12
UNION ALL
SELECT
    2 as Tag,
    1 as Parent,
    Reason.ScrapReasonId,
    NULL,
    WorkOrderId,
    ScrappedQty
FROM Production.ScrapReason Reason
JOIN Production.WorkOrder WorkOrder
ON Reason.ScrapReasonId = WorkOrder.ScrapReasonID
WHERE Reason.ScrapReasonId = 12
ORDER BY 'ScrapReason!1!ScrapReasonId!element', 'WorkOrder!2!WorkOrderId'
FOR XML EXPLICIT, ROOT('ScrappedWorkOrders')
go
<ScrappedWorkOrders>
    <ScrapReason>
      <ScrapReasonId>12</ScrapReasonId>
      <![CDATA[Thermoform temperature too high]]>
      <WorkOrder WorkOrderId="2573" ScrappedQuantity="14" />
    </ScrapReason>
    <ScrapReason>
      <ScrapReasonId>12</ScrapReasonId>
      <![CDATA[Thermoform temperature too high]]>
      <WorkOrder WorkOrderId="4972" ScrappedQuantity="1" />
    </ScrapReason>
    <ScrapReason>
      <ScrapReasonId>12</ScrapReasonId>
      <![CDATA[Thermoform temperature too high]]>
      <WorkOrder WorkOrderId="7771" ScrappedQuantity="6" />
    </ScrapReason>
    <ScrapReason>
      <ScrapReasonId>12</ScrapReasonId>
      <![CDATA[Thermoform temperature too high]]>
      <WorkOrder WorkOrderId="9071" ScrappedQuantity="1" />
    </ScrapReason>
    <ScrapReason>
      <ScrapReasonId>12</ScrapReasonId>
      <![CDATA[Thermoform temperature too high]]>
      <WorkOrder WorkOrderId="10274" ScrappedQuantity="1" />
    </ScrapReason>
{...}
</ScrappedWorkOrders>

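The ORDER BY clause determines the order of the rows in the universal table. This matters because the XML generator processes the rows sequentially and nests each row under the nearest preceding row whose Tag matches its Parent value. Sorting first by the ScrapReason key and then by the WorkOrder key is what places the WorkOrder elements underneath their ScrapReason elements.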
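As a variation on Listing 9, the sketch below shows how the ordering interacts with nesting when the first SELECT reads only the parent table. It is an illustration under stated assumptions, not the original query: @Reason and @WorkOrder are hypothetical table variables standing in for the AdventureWorks tables, loaded with two of the work orders shown in the output above.

```sql
-- Hypothetical stand-ins for Production.ScrapReason and Production.WorkOrder.
DECLARE @Reason TABLE (ScrapReasonId int, Name nvarchar(50));
DECLARE @WorkOrder TABLE (WorkOrderId int, ScrapReasonId int, ScrappedQty int);

INSERT INTO @Reason VALUES (12, N'Thermoform temperature too high');
INSERT INTO @WorkOrder VALUES (2573, 12, 14), (4972, 12, 1);

-- Parent level: one row per scrap reason, with no join to the work orders.
SELECT 1    AS Tag,
       NULL AS Parent,
       r.ScrapReasonId AS [ScrapReason!1!ScrapReasonId!element],
       r.Name          AS [ScrapReason!1!!cdata],
       NULL            AS [WorkOrder!2!WorkOrderId],
       NULL            AS [WorkOrder!2!ScrappedQuantity]
FROM @Reason r
UNION ALL
-- Child level: one row per work order, carrying the parent key for sorting.
SELECT 2, 1, w.ScrapReasonId, NULL, w.WorkOrderId, w.ScrappedQty
FROM @WorkOrder w
-- Parent rows have a NULL WorkOrderId, so each sorts ahead of its children.
ORDER BY [ScrapReason!1!ScrapReasonId!element], [WorkOrder!2!WorkOrderId]
FOR XML EXPLICIT, ROOT('ScrappedWorkOrders');
```

Because there is a single Tag 1 row for reason 12 and it sorts before the Tag 2 rows, both WorkOrder elements should end up inside one ScrapReason element. In Listing 9, by contrast, the first SELECT joins to the work orders, so it emits one ScrapReason row (and therefore one ScrapReason element) per work order.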

Like the other modes, FOR XML EXPLICIT supports the BINARY BASE64 keywords, although base-64 encoding is performed automatically by the parser, even if not specified.

The ROOT keyword can also be used, although not when specifying XMLDATA. XMLSCHEMA is not supported as of this writing. ELEMENTS and XSINIL are also not supported, probably because you can get along without them, thanks to the many shaping options available.